Locality sensitive semi-supervised feature selection

Authors

  • Jidong Zhao
  • Ke Lu
  • Xiaofei He
Abstract

In many computer vision tasks, such as face recognition and image retrieval, one is often confronted with high-dimensional data. Procedures that are analytically or computationally manageable in low-dimensional spaces can become completely impractical in a space of several hundred or several thousand dimensions. Thus, various techniques have been developed for reducing the dimensionality of the feature space in the hope of obtaining a more manageable problem. The most popular feature selection and extraction techniques include Fisher score, Principal Component Analysis (PCA), and Laplacian score. Among them, PCA and Laplacian score are unsupervised methods, while Fisher score is a supervised method. None of them can take advantage of both labeled and unlabeled data points. In this paper, we introduce a novel semi-supervised feature selection algorithm that makes use of both labeled and unlabeled data points. Specifically, the labeled points are used to maximize the margin between data points from different classes, while the unlabeled points are used to discover the geometrical structure of the data space. We compare our proposed algorithm with Fisher score and Laplacian score on face recognition. Experimental results demonstrate the efficiency and effectiveness of our algorithm. © 2008 Published by Elsevier B.V.
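For orientation, the two baseline criteria named in the abstract can be sketched as follows. This is a minimal illustration of the commonly used textbook formulations of Fisher score and Laplacian score, not the semi-supervised algorithm proposed in the paper, whose details are not given here. The function names, the `n_neighbors` and `t` parameters, and the heat-kernel k-nearest-neighbor graph are assumptions made for this example.

```python
import numpy as np


def fisher_score(X, y):
    """Supervised criterion: between-class over within-class variance per feature.

    Higher scores indicate more discriminative features.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        n_c = Xc.shape[0]
        between += n_c * (Xc.mean(axis=0) - overall_mean) ** 2
        within += n_c * Xc.var(axis=0)
    return between / (within + 1e-12)  # small constant avoids division by zero


def laplacian_score(X, n_neighbors=5, t=1.0):
    """Unsupervised criterion: how well each feature preserves local structure.

    Lower scores indicate features that vary smoothly over the neighbor graph.
    Assumes n_neighbors < number of samples.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(dist2, np.inf)  # a point is not its own neighbor
    # Heat-kernel weights on a symmetrized k-nearest-neighbor graph.
    nbrs = np.argsort(dist2, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nbrs.ravel()
    S = np.zeros((n, n))
    S[rows, cols] = np.exp(-dist2[rows, cols] / t)
    S = np.maximum(S, S.T)
    d = S.sum(axis=1)          # node degrees
    L = np.diag(d) - S         # graph Laplacian
    # Remove the degree-weighted mean of each feature.
    F = X - (d @ X) / d.sum()
    num = np.einsum("ij,ij->j", F, L @ F)
    den = np.einsum("ij,ij->j", F, d[:, None] * F)
    return num / (den + 1e-12)


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 does not.
    X = np.array([[0.0, 1.0], [0.1, -1.0], [0.2, 1.0],
                  [10.0, -1.0], [10.1, 1.0], [10.2, -1.0]])
    y = np.array([0, 0, 0, 1, 1, 1])
    print("Fisher score:", fisher_score(X, y))
    print("Laplacian score:", laplacian_score(X, n_neighbors=2))
```

Note the opposite conventions: features are ranked by descending Fisher score but by ascending Laplacian score. Only the first function uses the labels `y`, which is the gap a semi-supervised criterion aims to close.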


Similar resources

Semi-Supervised Feature Selection with Constraint Sets

In machine learning, classification and recognition are crucial tasks. An object is recognized with the help of the features associated with it. Among many features, only some help classify the object correctly. Feature selection is a useful technique for detecting such features. Feature selection is the process of selecting a subset of features to reduce the number of features (dimensionality reduction)...

Locality sensitive hashing for fast computation of correlational manifold learning based feature space transformations

Manifold learning based techniques have been found to be useful for feature space transformations and semi-supervised learning in speech processing. However, the immense computational requirements in building neighborhood graphs have hindered the application of these techniques to large speech corpora. This paper presents an approach for fast computation of neighborhood graphs in the context of...

Estimation of Individual Micro Data from Aggregated Open Data

In this paper, we propose a method of estimating individual micro data from aggregated open data based on semi-supervised learning and conditional probability. First, the proposed method collects aggregated open data and support data, which are related to the individual micro data to be estimated. Then, we perform the locality sensitive hashing (LSH) algorithm to find a subset of the support d...

Graph Laplacian for Semi-supervised Feature Selection in Regression Problems

Feature selection is fundamental in many data mining or machine learning applications. Most of the algorithms proposed for this task make the assumption that the data are either supervised or unsupervised, while in practice supervised and unsupervised samples are often simultaneously available. Semi-supervised feature selection is thus needed, and has been studied quite intensively these past f...

Journal title:
  • Neurocomputing

Volume 71  Issue 

Pages  -

Publication date 2008